This paper presents an algorithm for estimating the weight of a maximum weighted
matching by augmenting any estimation routine for the size of an unweighted matching.
The algorithm is implementable in any streaming model, including dynamic graph
streams. We also give the first constant-factor estimation of the maximum matching size
in a dynamic graph stream for planar graphs (or any graph with bounded arboricity)
using $\tilde{O}(n^{4/5})$ space, which also extends to weighted matching. Using previous results by
Kapralov, Khanna, and Sudan (2014), we obtain a $\mathrm{polylog}(n)$ approximation for general
graphs using $\mathrm{polylog}(n)$ space in random order streams. In addition, we
give a space lower bound of $\Omega(n^{1-\varepsilon})$ for any randomized algorithm estimating the size
of a maximum matching up to a $1 + O(\varepsilon)$ factor for adversarial streams.


1 Introduction 

Large graph structures encountered in social networks or the web graph have become a focus of
analysis in both theory and practice. To process such large inputs, conventional algorithms
often require an infeasible amount of running time, space, or both, giving rise to other models of
computation. Much theoretical research focuses on the streaming model, where the input arrives one
item at a time and the goal is to store as much information as possible in small, preferably polylogarithmic,
space. Streaming algorithms on graphs were first studied by Henzinger et al. [19], who showed that
even simple problems often admit no solution with such small space requirements. The semi-streaming
model [16], where the stream consists of the edges of a graph and the algorithm is
allowed $O(n \cdot \mathrm{polylog}(n))$ space and few (ideally just one) passes over the data, relaxes these
requirements and has received considerable attention. Problems studied in the semi-streaming
model include sparsification, spanners, connectivity, minimum spanning trees, counting triangles,
and matching; for an overview we refer to a recent survey by McGregor. Because the
graphs motivating this research are dynamic structures that change over time, there has recently
been research on streaming algorithms supporting deletions. We now review the literature on
streaming algorithms for matching and dynamic streams.

Matching. Maintaining a 2-approximation to the maximum matching (MM) in an insertion-only
stream can be done straightforwardly by greedily maintaining a maximal matching [16]. Improving
on this algorithm turns out to be difficult: Goel et al. [18] showed that no algorithm using
$\tilde{O}(n)$ space can achieve an approximation ratio better than $3/2$, which was improved by Kapralov to
$e/(e-1)$ [22]. Konrad et al. [25] gave an algorithm using $\tilde{O}(n)$ space with an approximation factor of
1.989 if the edges are assumed to arrive in random order. For weighted matching (MWM), a series
of results have been published, with the current best bound of $4 + \varepsilon$ being due
to Crouch and Stubbs.
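
The greedy 2-approximation mentioned above can be sketched in a few lines. This is a minimal illustration, assuming the stream is an iterable of node pairs; the function name `greedy_matching` is ours and not taken from the cited works.

```python
def greedy_matching(edge_stream):
    """One pass over an insertion-only stream.

    An edge is kept iff both endpoints are still unmatched, so the result
    is a maximal matching and hence at least half the maximum size.
    """
    matched = set()      # nodes already covered by the matching
    matching = []
    for u, v in edge_stream:
        if u != v and u not in matched and v not in matched:
            matching.append((u, v))
            matched.update((u, v))
    return matching


# Path a-b-c-d with the middle edge arriving first: greedy keeps one edge,
# while the maximum matching {a-b, c-d} has two.
stream = [("b", "c"), ("a", "b"), ("c", "d")]
print(len(greedy_matching(stream)))
```

The example shows that the factor 2 is tight for this strategy: the arrival order alone decides whether one or two edges are kept.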

To bypass the natural $\Omega(n)$ bound required by any algorithm maintaining an approximate matching,
recent research has begun to focus on estimating the size of the maximum matching. Kapralov
et al. [23] gave a polylogarithmic approximate estimate using polylogarithmic space for random order
streams. For certain sparse graphs including planar graphs, Esfandiari et al. [15] describe how to
obtain a constant factor estimation using $\tilde{O}(n^{2/3})$ space in a single pass and $\tilde{O}(\sqrt{n})$ space using
two passes or assuming randomly ordered streams. The authors also gave a lower bound of $\Omega(\sqrt{n})$
for any approximation better than $3/2$.


Dynamic Streams. In the turnstile model, the stream consists of a sequence of additive updates
to a vector. Problems studied in this model include numerical linear algebra problems such as
regression and low-rank approximation, and maintaining certain statistics of a vector like frequency
moments, heavy hitters or entropy. Linear sketches have proven to be the algorithmic technique of
choice and may well be the only algorithmic tool able to do so efficiently, see Li, Nguyen and
Woodruff [27]. Dynamic graphs as introduced and studied by Ahn, Guha and McGregor
are similar to, but weaker than, turnstile updates. Though both streaming models assume updates to
the input matrix, there usually exists a consistency assumption for streams, i.e. at any given time
the multiplicity of an edge is either 0 or 1, and edge weights cannot change arbitrarily but are first set
to 0 and then reinserted with the desired weight. The authors extend some of the aforementioned
problems such as connectivity, sparsification and minimum spanning trees to this setting. Recent
results by Assadi et al. [5] showed that approximating matchings in dynamic streams is hard by
providing a space lower bound of $\Omega(n^{2-3\varepsilon})$ for approximating the maximum matching within a
factor of $\tilde{O}(n^{\varepsilon})$. Simultaneously, Konrad [23] showed a similar but slightly weaker lower bound of
$\Omega(n^{3/2-4\varepsilon})$. Both works presented an algorithm with an almost matching upper bound on the space
complexity of $\tilde{O}(n^{2-2\varepsilon})$ [23] and $\tilde{O}(n^{2-3\varepsilon})$ [5]. Chitnis et al. [8] gave a streaming algorithm using
$\tilde{O}(k^2)$ space that returns an exact maximum matching under the assumption that the size is at most
$k$. It is important to note that all these results actually compute a matching. In terms of estimating
the size of the maximum matching, Chitnis et al. [8] extended the estimation algorithms for sparse
graphs from [15] to the setting of dynamic streams using $\tilde{O}(n^{4/5})$ space. A bridge between dynamic
graphs and the insertion-only streaming model is the sliding window model studied by Crouch et
al. The authors give a $(3 + \varepsilon)$-approximation algorithm for maximum matching.

The $p$-Schatten norm of a matrix $A$ is defined as the $\ell_p$-norm of the vector of singular values. It is
well known that computing the maximum matching size is equivalent to computing the rank of the
Tutte matrix (see also Section 2.1). Estimating the maximum matching size therefore is a
special case of estimating the rank or 0-Schatten norm of a matrix. Li, Nguyen and Woodruff gave
strong lower bounds on the space requirement for estimating Schatten norms in dynamic streams
[26]. Any estimation of the rank within any constant factor is shown to require $\Omega(n^2)$ space when
using bi-linear sketches and $\Omega(\sqrt{n})$ space for general linear sketches.
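
The link between rank and matching size can be made concrete with a small offline sketch. It assumes the standard definition of the Tutte matrix (entry $x_{ij}$ for an edge $\{i,j\}$ with $i<j$, $-x_{ij}$ in the mirrored position, zero elsewhere), replaces the indeterminates by random elements of a prime field, and uses the fact that the rank is then twice the maximum matching size with high probability. The prime, the helper names and the seed are illustrative choices, not taken from the paper, and this is not a streaming algorithm.

```python
import random

P = 2_147_483_647  # the Mersenne prime 2^31 - 1, used as the field size


def rank_mod_p(rows, p=P):
    """Rank of a square matrix over GF(p) via Gauss-Jordan elimination."""
    rows = [r[:] for r in rows]
    n = len(rows)
    rank = 0
    for col in range(n):
        pivot = next((r for r in range(rank, n) if rows[r][col] % p), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)          # modular inverse (Python 3.8+)
        rows[rank] = [(x * inv) % p for x in rows[rank]]
        for r in range(n):
            if r != rank and rows[r][col] % p:
                f = rows[r][col]
                rows[r] = [(a - f * b) % p for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank


def matching_size_via_tutte(n, edges, seed=0):
    """Half the rank of a randomly instantiated Tutte matrix on nodes 0..n-1."""
    rng = random.Random(seed)
    t = [[0] * n for _ in range(n)]
    for u, v in edges:
        x = rng.randrange(1, P)                    # nonzero random field element
        i, j = min(u, v), max(u, v)
        t[i][j] = x
        t[j][i] = (-x) % P                         # skew-symmetric counterpart
    return rank_mod_p(t) // 2


print(matching_size_via_tutte(4, [(0, 1), (1, 2), (2, 3)]))  # path on 4 nodes
```

Because every update to the matrix is linear in the edge indicator, this view is what makes rank-based estimation compatible with deletions.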


Techniques and Contribution 

Reference                 Graph class          Streaming model   Approx. factor          Space
MM:  Greedy               General              Adversarial       $2$                     $O(n)$
     [23]                 General              Random            $\mathrm{polylog}(n)$   $\mathrm{polylog}(n)$
     [15]                 Trees                Adversarial       $2 + \varepsilon$       $\tilde{O}(\sqrt{n})$
     [15]                 Bounded arboricity   Adversarial       $O(1)$                  $\tilde{O}(n^{2/3})$
     here                 Trees                Dynamic           $2 + \varepsilon$       $O(\log^2 n / \varepsilon^2)$
     here                 Bounded arboricity   Dynamic           $O(1)$                  $\tilde{O}(n^{4/5})$
     [15]                 Forests              Adversarial       $3/2 - \varepsilon$     $\Omega(\sqrt{n})$
     here                 General              Adversarial       $1 + O(\varepsilon)$    $\Omega(n^{1-\varepsilon})$
MWM: Crouch and Stubbs    General              Adversarial       $4 + \varepsilon$       $O(n \log^2 n)$
     here                 General              Random            $\mathrm{polylog}(n)$   $\mathrm{polylog}(n)$
     here                 Bounded arboricity   Dynamic           $O(1)$                  $\tilde{O}(n^{4/5})$

Table 1: Results for estimating the size (weight) of a maximum (weighted) matching in data
streams.

Table 1 gives an overview of our results in comparison to previously known algorithms and lower
bounds. Our first main result (Section 2) is an approximate estimation algorithm for the maximum
weight of a matching. We give a generic procedure using any unweighted estimation as a black box.
In particular:

Theorem 1 (informal version). Given a $\lambda$-approximate estimation using $S$ space, there exists an
$O(\lambda^4)$-approximate estimation algorithm for the weighted matching problem using $O(S \cdot \log n)$ space.

The previous algorithms for weighted matchings in insertion-only streams
extend the greedy approach by a charging scheme. If edges are mutually exclusive, the
new edge will be added if the weight of the matching increases by a given threshold, implicitly
partitioning the edges into sets of geometrically increasing weights. We use a similar scheme, but
with a twist: single edge weights cannot be charged to an edge with larger weight, as estimation
routines do not necessarily give information on distinct edges. However, entire matchings can be
charged, as the contribution of a specific range of weights $r$ can only be large if these edges take up
a significant part of any maximum matching in the subgraph containing only the edges of weight
at least $r$. For the analysis, we use a result on parallel algorithms by Uehara and Chen [32]. We show
that the weight output by our algorithm is close to the weight of the matching computed by the
authors, implying an approximation to the maximum weight.

We can implement this algorithm in dynamic streams, although at submission we were unaware
of any estimations for dynamic streams. Building on the work by Esfandiari et al. [15], we give a
constant estimation on the matching size in bounded arboricity graphs. The main obstacle to adapting
their algorithms for bounded arboricity graphs is that they maintain a small matching using
the greedy algorithm, which is hard for dynamic streams. Instead of maintaining a matching, we
use the Tutte matrix to get a 1-pass streaming algorithm using $\tilde{O}(n^{4/5})$ space, which immediately
extends to weighted matching. Similar bounds have been obtained independently by Chitnis et al. [8].

Algorithm 1 Approximation of Weighted Matching from [32]
Require: Graph $G = (V, E = \bigcup_{i=1}^{t} E_i)$
Ensure: Matching
  for $i = t$ to $1$ do
    Find a maximal matching $M_i$ in $G_i = (V, E_i)$.
    Remove all edges $e$ from $E$ such that $e \in M_i$ or $e$ shares a node with an edge in $M_i$.
  return $\bigcup_{i=1}^{t} M_i$

Our lower bound (Section 3) is proven via reduction from the Boolean Hidden Hypermatching
problem introduced by Verbin and Yu [33]. In this setting, two players Alice and Bob are given a
binary $n$-bit string and a perfect $t$-hypermatching on $n$ nodes, respectively. Bob also gets a binary
string $w$. The players are promised that the parities of the bits corresponding to the nodes of the $i$-th
hyperedge either are equal to $w_i$ for all $i$ or equal to $1 - w_i$ for all $i$, and the task is to find out
which case holds using only a single round of communication. We construct a graph consisting of
a $t$-clique for each hyperedge of Bob's matching and a single edge for each bit of Alice's input that
has one node in common with the $t$-cliques. Then we show that approximating the matching size
within a factor better than $1 + O(1/t)$ can also solve the Boolean Hidden Hypermatching instance.
Using the lower bound of $\Omega(n^{1-1/t})$ from [33] we have

Theorem 2 (informal version). Any 1-pass streaming algorithm approximating the size of the
maximum matching up to a $(1 + O(\varepsilon))$ factor requires $\Omega(n^{1-\varepsilon})$ bits of space.

This lower bound also implies an $\Omega(n^{1-\varepsilon})$ space bound for $(1 + O(\varepsilon))$-approximating the rank of
a matrix in data streams, which also improves the $\Omega(\sqrt{n})$ bound by Li, Nguyen, and Woodruff [26]
for linear sketches.

1.1 Preliminaries 

We use $\tilde{O}(f(n))$ to hide factors polylogarithmic in $f(n)$. Any randomized algorithm succeeding with
high probability has at least $1 - 1/n$ chance of success. Graphs are denoted by $G(V, E, w)$ where
$V$ is the set of $n$ nodes, $E$ is the set of edges and $w : E \to \mathbb{R}^+$ is a weight function. Our estimated
value $\widehat{M}$ is a $\lambda$-approximation to the size of the maximum matching $M$ if $\widehat{M} \le |M| \le \lambda \widehat{M}$.

2 Weighted Matching 

We start by describing the parallel algorithm by Uehara and Chen [32], see Algorithm 1. Let
$\gamma > 1$ and $k > 0$ be constants. We partition the edge set by $t$ ranks where all edges $e$ in rank
$i \in \{1, \ldots, t\}$ have a weight $w(e) \in \left(\gamma^{i-1} \cdot \frac{w_{max}}{kn}, \gamma^{i} \cdot \frac{w_{max}}{kn}\right]$, where $w_{max}$ is the maximal weight in $G$.

Let $G' = (V, E, w')$ be equal to $G$ but with each edge $e$ in rank $i$ carrying a rounded weight $r_i$ that depends only on its rank, for all $i = 1, \ldots, t$.
Starting with $i = t$, we compute an unweighted maximal matching $M_i$ considering only edges in
rank $i$ (in $G'$) and remove all edges incident to a matched node. We then continue with $i - 1$. The weight of
the matching $M = \bigcup_i M_i$ is $w(M) = \sum_{i=1}^{t} r_i \cdot |M_i|$ and satisfies $w_G(M^*) \ge w_{G'}(M) \ge \alpha \cdot w_G(M^*)$
for a constant $\alpha > 0$ depending only on $\gamma$ and $k$, where $M^*$ is an optimal weighted matching in $G$. The previous algorithms for
insertion-only streams use a similar partitioning of edge weights. Since these algorithms are limited
to storing one maximal matching (in one case, one maximal matching per rank), they cannot
compute residual maximal matchings in each rank. However, by charging the smaller edge weights
into the higher ones, the resulting approximation factor can be made reasonably close to that of
Uehara and Chen. Since these algorithms maintain matchings, they cannot have sublinear space
in an insertion-only stream, and they need at least $\Omega(n^{2-3\varepsilon})$ space in a dynamic stream even when the
maintained matching is only an $O(n^{\varepsilon})$ approximation [5]. Though the complexity of
estimating unweighted matchings is not settled for any streaming model, there exist graph classes
for which one can improve on these algorithms with respect to space requirements. Therefore, we assume the
existence of a black box $\lambda$-approximate matching estimation algorithm.
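
The rank-by-rank greedy procedure of Algorithm 1 can be illustrated offline as follows. The sketch assumes weights of at least 1, ranks given by powers of two and rounded weights $2^i$ (the simplification the paper adopts later); the helper names and the brute-force optimum are ours and only meant for tiny inputs.

```python
from itertools import combinations


def rank_of(w):
    """Rank i with 2**i <= w < 2**(i+1), for a weight w >= 1."""
    return int(w).bit_length() - 1


def layered_greedy(edges):
    """Process ranks from heaviest to lightest; within a rank, greedily
    extend the matching using only nodes that are still unmatched."""
    by_rank = {}
    for u, v, w in edges:
        by_rank.setdefault(rank_of(w), []).append((u, v, w))
    matched, matching = set(), []
    for i in sorted(by_rank, reverse=True):
        for u, v, w in by_rank[i]:
            if u not in matched and v not in matched:
                matching.append((u, v, w))
                matched.update((u, v))
    return matching


def brute_force_opt(edges):
    """Exact maximum matching weight by enumeration (tiny graphs only)."""
    best = 0
    for k in range(1, len(edges) + 1):
        for subset in combinations(edges, k):
            nodes = [x for u, v, _ in subset for x in (u, v)]
            if len(nodes) == len(set(nodes)):
                best = max(best, sum(w for _, _, w in subset))
    return best


edges = [(0, 1, 3), (1, 2, 4), (2, 3, 3), (3, 4, 1)]
m = layered_greedy(edges)
rounded = sum(2 ** rank_of(w) for _, _, w in m)   # weight of M under rounded weights
opt = brute_force_opt(edges)
print(rounded, opt)
```

On this path the heaviest edge blocks both weight-3 edges, so the rounded weight is 5 against an optimum of 6, well inside the constant-factor guarantee.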

Algorithm and Analysis 

In order to adapt this idea to our setting, we need to work out the key properties of the partitioning
and how we can implement it in a stream. The first problem is that we cannot know $w_{max}$ in
a stream a priori, and in a dynamic stream even maintaining $w_{max}$ is difficult. However, the
appropriate partition of an inserted edge depends on $w_{max}$. Recalling the partitioning of Uehara
and Chen, we disregard all edges with weight smaller than $\frac{w_{max}}{kn}$, which is possible because the
contribution of these edges is at most $\frac{n}{2} \cdot \frac{w_{max}}{kn} = \frac{w_{max}}{2k} \le \frac{OPT}{2k}$, where $OPT$ is the weight of an
optimal weighted matching. Thus, it suffices to consider only edges with larger weight, and it is also
possible to partition the set of edges into a logarithmic number of sets. Here, we use the properties
that edge weights within a single partition set are similar and that $\frac{w'(e)}{w'(e')} = \gamma$ for two edges
$e \in E_i$ and $e' \in E_{i-1}$ with $i \in \{2, \ldots, t\}$. These properties are sufficient to get a good approximation
of the optimal weighted matching, which we show in the next lemma. The proof is essentially the
same as in [32].

Lemma 1. Let $G = (V, E, w)$ be a weighted graph and $\varepsilon > 0$ be an approximation parameter. If a
partitioning $E_1, \ldots, E_t$ of $E$ and a weight function $w' : E \to \mathbb{R}$ satisfy

$$\frac{1}{1+\varepsilon} \le \frac{w'(e)}{w(e)} \le 1 \text{ for all } e \in E \quad\text{and}\quad \frac{w(e_1)}{w(e_2)} \le 1 + \varepsilon \quad\text{and}\quad w(e) < w(e')$$

for all choices of edges $e_1, e_2 \in E_i$ and $e \in E_i$, $e' \in E_j$ with $i < j$ and $i, j \in \{1, \ldots, t\}$, then Algorithm
1 returns a matching $M = \bigcup_{i=1}^{t} M_i$ with

$$\frac{1}{2(1+\varepsilon)^2} \cdot w(M^*) \le w'(M) \le w(M^*)$$

where $M^*$ is an optimal weighted matching in $G$.

Proof. The first property $\frac{1}{1+\varepsilon} \le \frac{w'(e)}{w(e)} \le 1$ for all $e \in E$ implies that $\frac{w(S)}{1+\varepsilon} \le w'(S) \le w(S)$ for every
set of edges $S \subseteq E$. Thus, it remains to show that $\frac{1}{2(1+\varepsilon)} \cdot w(M^*) \le w(M) \le w(M^*)$. Since $M^*$ is
an optimal weighted matching, it is clear that $w(M) \le w(M^*)$. For the lower bound, we distribute
the weight of the edges from the optimal solution to edges in $M$. Let $e \in M^*$ and $i \in \{1, \ldots, t\}$
such that $e \in E_i$. We consider the following cases:

1. $e \in M_i$: We charge the weight $w(e)$ to the edge itself.

2. $e \notin M_i$ but at least one node incident to $e$ is matched by an edge in $M_i$: Let $e' \in M_i$ be an
edge sharing a node with $e$. Distribute the weight $w(e)$ to $e'$.

3. $e \notin M_i$ and there is no edge in $M_i$ sharing a node with $e$: By Algorithm 1 there has to be
an edge $e' \in M_j$ with $j > i$ which shares a node with $e$. We distribute the weight $w(e)$ to $e'$.

Since $M^*$ is a matching, at most two edges from $M^*$ can distribute their weights to
an edge in $M$. We know that $\frac{w(e)}{w(e')} \le 1 + \varepsilon$ for all choices of two edges $e, e' \in E_i$ with $i \in \{1, \ldots, t\}$,
which means that in case 2 we have $w(e) \le (1 + \varepsilon) \cdot w(e')$. In case 3 it holds that $w(e) < w(e')$.
Thus, the weight distributed to an edge $e'$ in $M$ is at most $2(1 + \varepsilon) w(e')$. This implies that
$w(M^*) = \sum_{e \in M^*} w(e) \le \sum_{e' \in M} 2(1 + \varepsilon) \cdot w(e') = 2(1 + \varepsilon) \cdot w(M)$, which concludes the proof. □

Using Lemma 1 we can partition the edge set in a stream in an almost oblivious manner: Let
$(e_0, w(e_0))$ be the first inserted edge. Then an edge $e$ belongs to $E_i$ iff $2^{i-1} \cdot w(e_0) < w(e) \le 2^{i} \cdot w(e_0)$
for some $i \in \mathbb{Z}$. For the sake of simplicity, we assume that the edge weights are in $[1, W]$. Then the
number of sets is $O(\log W)$. We would typically expect $W \in \mathrm{poly}(n)$, as otherwise storing weights
becomes infeasible.

We now introduce a bit of notation we will use in the algorithm and throughout the proof. We
partition the edge set $E = \bigcup_{i=0}^{t} E_i$ by $t + 1 = O(\log W)$ ranks where the set $E_i$ contains all edges
$e$ with weight $w(e) \in [2^i, 2^{i+1})$. Wlog we assume $E_t \neq \emptyset$ (otherwise let $t$ be the largest rank with
$E_t \neq \emptyset$). Let $G' = (V, E, w')$ be equal to $G$ but each edge $e \in E_i$ has weight $w'(e) = r_i := 2^i$ for all
$i = 0, \ldots, t$. Let $M = \bigcup_{i=0}^{t} M_i$ be the matching computed by the partitioning algorithm and $S$ be
a $(t+1)$-dimensional vector with $S_i = \sum_{j=i}^{t} |M_j|$.

Algorithm 2 Weighted Matching Approximation
Require: Graph $G = (V, \bigcup_{i=0}^{t} E_i)$ with weights $r_i$ for edges in $E_i$
Ensure: Estimator of the weighted matching
  for $i = t$ to $0$ do
    $\widehat{S}_i = \widehat{R}_i = 0$
  weight $= 0$, last $= t$
  $\widehat{R}_t = \widehat{S}_t =$ Unweighted Matching Estimation$(V, E_t)$
  for $i = t - 1$ to $0$ do
    $\widehat{S}_i =$ Unweighted Matching Estimation$(V, \bigcup_{j=i}^{t} E_j)$
    if $\widehat{S}_i > \widehat{S}_{last} \cdot T$ then                          ▷ Add current index $i$ to $I_{good}$
      if $\widehat{S}_i - \widehat{S}_{last} > c \cdot \widehat{R}_{last}$ then        ▷ Add current index $i$ to $I_{sign}$
        $\widehat{R}_i = \widehat{S}_i - \widehat{S}_{last}$
        last $= i$
    else
      $\widehat{S}_i = 0$
  return $\sum_{i=0}^{t} r_i \cdot \widehat{R}_i$

Algorithm 2 now proceeds as follows: For every $i \in \{0, \ldots, t\}$ the size of a maximum matching in
$(V, \bigcup_{j=i}^{t} E_j)$ and $S_i$ differ by only a constant factor. Conceptually, we set our estimator $\widehat{S}_i$ of $S_i$ to
be the approximation of the size of the maximum matching of $(V, \bigcup_{j=i}^{t} E_j)$, and the estimator of the
contribution of the edges in $E_i$ to the weight of an optimal weighted matching is $\widehat{R}_i = \widehat{S}_i - \widehat{S}_{i+1}$.
The estimator $\widehat{R}_i$ is crude and generally not a good approximation to $|M_i|$. What helps us is that
if the edges $M_i$ have a significant contribution to $w(M)$, then $|M_i| \gg \sum_{j=i+1}^{t} |M_j| = S_{i+1}$. In
order to detect whether the matching $M_i$ has a significant contribution to the objective value, we
introduce two parameters $T$ and $c$. The first matching $M_t$ is always significant (and the simplest
to approximate by setting $\widehat{R}_t = \widehat{S}_t$). For all subsequent matchings $i < t$, let $j$ be the most recent
matching which we deemed to be significant. We require $\widehat{S}_i > T \cdot \widehat{S}_j$ and $\widehat{S}_i - \widehat{S}_j > c \cdot \widehat{R}_j$. If both criteria
are satisfied, we use the estimator $\widehat{R}_i = \widehat{S}_i - \widehat{S}_j$ and set $i$ to be the now most recent significant
matching; otherwise we set $\widehat{R}_i = 0$. The final estimator of the weight is $\sum_{i=0}^{t} r_i \cdot \widehat{R}_i$. The next
definition gives a more detailed description of the two sets of ranks which are important for the
analysis.
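
The control flow of Algorithm 2 can be sketched offline as follows. The black box here is a stand-in: the size of a greedy maximal matching, which is a 2-approximate estimate, so $\lambda = 2$. The parameter values $T = 8\lambda^2 - 2\lambda$ and $c = \frac{2}{5}T + 5\lambda$ are taken from the analysis further below and are treated as assumptions of this sketch; the function names are ours, and a streaming implementation would replace the stored edge list by one sketch per rank.

```python
def greedy_size(edges):
    """Stand-in black box: size of a greedy maximal matching.
    It satisfies |M|/2 <= estimate <= |M| for the maximum matching M."""
    matched, size = set(), 0
    for u, v in edges:
        if u not in matched and v not in matched:
            matched.update((u, v))
            size += 1
    return size


def estimate_weight(edges, estimator=greedy_size, lam=2.0):
    """Offline sketch of Algorithm 2; edges are (u, v, w) with w >= 1."""
    T = 8 * lam ** 2 - 2 * lam          # threshold for good ranks
    c = 0.4 * T + 5 * lam               # threshold for significant ranks
    rank = lambda w: int(w).bit_length() - 1
    t = max(rank(w) for _, _, w in edges)
    # suffix(i) is the edge set of ranks i, ..., t without weights
    suffix = lambda i: [(u, v) for u, v, w in edges if rank(w) >= i]
    S = [0.0] * (t + 1)
    R = [0.0] * (t + 1)
    S[t] = R[t] = estimator(suffix(t))
    last = t
    for i in range(t - 1, -1, -1):
        S[i] = estimator(suffix(i))
        if S[i] > T * S[last]:                    # i is a good rank
            if S[i] - S[last] > c * R[last]:      # i is also significant
                R[i] = S[i] - S[last]
                last = i
        else:
            S[i] = 0.0
    return sum(2 ** i * R[i] for i in range(t + 1))


# One heavy edge plus 100 disjoint unit-weight edges: both ranks matter.
edges = [(0, 1, 8)] + [(2 + 2 * k, 3 + 2 * k, 1) for k in range(100)]
print(estimate_weight(edges))
```

A lone edge of weight 5 is reported as 4 because only its rounded weight $2^2$ is visible to the estimator; in the larger example the light rank passes both thresholds and is added on top of the heavy edge.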

Definition 1 (Good and Significant Ranks). Let $\widehat{S}$ and $\widehat{R}$ be the vectors at the end of Algorithm 2.
An index $i$ is called a good rank if $\widehat{S}_i \neq 0$, and $i$ is a significant rank if $\widehat{R}_i \neq 0$. We denote the set
of good ranks by $I_{good}$ and the set of significant ranks by $I_{sign}$, i.e., $I_{good} := \{ i \in \{0, \ldots, t\} \mid \widehat{S}_i \neq 0 \}$
and $I_{sign} := \{ i \in \{0, \ldots, t\} \mid \widehat{R}_i \neq 0 \}$. We define $I_{good}$ and $I_{sign}$ to be in descending order and we
will refer to the $\ell$-th element of $I_{good}$ and $I_{sign}$ by $I_{good}(\ell)$ and $I_{sign}(\ell)$, respectively. That means
$I_{good}(1) > I_{good}(2) > \ldots > I_{good}(|I_{good}|)$ and $I_{sign}(1) > I_{sign}(2) > \ldots > I_{sign}(|I_{sign}|)$. We slightly
abuse the notation and set $I_{sign}(|I_{sign}| + 1) = 0$. Let $D_1 := |M_t|$, and for $\ell \in \{2, \ldots, |I_{sign}|\}$ we
define the sum of the matching sizes between two significant ranks $I_{sign}(\ell)$ and $I_{sign}(\ell - 1)$, where
the smaller significant rank is included, by $D_\ell := \sum_{i = I_{sign}(\ell)}^{I_{sign}(\ell-1)-1} |M_i|$.

In the following, we subscript indices by $s$ for significant ranks and by $g$ for good ranks for the
sake of readability. Looking at Algorithm 2, we can prove some simple properties of $I_{good}$ and $I_{sign}$.

Lemma 2. Let $I_{good}$ and $I_{sign}$ be defined as in Definition 1. Then

1. $I_{good}(1) = I_{sign}(1) = t$ and $I_{sign} \subseteq I_{good}$.

2. For every good rank $i_g \in I_{good}$ there is an $\ell \in \{0, \ldots, |I_{sign}|\}$ such that $I_{sign}(\ell) > i_g \ge I_{sign}(\ell + 1)$ and $\widehat{S}_{i_g} > T \cdot \widehat{S}_{I_{sign}(\ell)}$.

3. For every $i_s, i'_s \in I_{sign}$ with $i_s < i'_s$ it holds that $\widehat{S}_{i_s} > T \cdot \widehat{S}_{i'_s}$.

4. For any $i_s \in I_{sign}$ and $i'_s \in I_{sign}$ with $i'_s < i_s$ it is $\widehat{R}_{i'_s} > c \cdot \widehat{R}_{i_s}$.

5. For any $i_s \in I_{sign}$ and $i_g \in I_{good}$ with $i_g < i_s$ it is $\widehat{S}_{i_g} > T \cdot \widehat{S}_{i_s}$.

Proof.

1. It is clear that $I_{sign} \subseteq I_{good}$. Since we assumed that $E_t \neq \emptyset$, there is a nonempty matching in
$E_t$, which means that $\widehat{S}_t = \widehat{R}_t > 0$.

2. Let $\ell$ be the position of last in $I_{sign}$, where last is the value of the variable in Algorithm
2 during the iteration $i = i_g$. Then $I_{sign}(\ell) > i_g \ge I_{sign}(\ell + 1)$ (recall that we defined
$I_{sign}(|I_{sign}| + 1) = 0$). Since $i_g$ is good, it is $\widehat{S}_{i_g} > T \cdot \widehat{S}_{last} = T \cdot \widehat{S}_{I_{sign}(\ell)}$.

3. Since significant ranks are also good, we can apply 2. to get $\widehat{S}_{I_{sign}(\ell+1)} > T \cdot \widehat{S}_{I_{sign}(\ell)}$, where
$I_{sign}(\ell + 1) < I_{sign}(\ell)$. By transitivity this implies the statement.

4. For every $i_s \in I_{sign}$ we have $\widehat{R}_{i_s} > c \cdot \widehat{R}_{last}$, where last is the value of the variable in Algorithm
2 in iteration $i = i_s$. By definition it is last $\in I_{sign}$ and last $> i_s$. Therefore, it holds that
$\widehat{R}_{I_{sign}(\ell+1)} > c \cdot \widehat{R}_{I_{sign}(\ell)}$ for every $\ell \in \{0, \ldots, |I_{sign}| - 1\}$, which implies the statement.

5. Using 2. we know that $\widehat{S}_{i_g} > T \cdot \widehat{S}_{I_{sign}(\ell)}$ for some $\ell \in \{0, \ldots, |I_{sign}|\}$. If $i_s$ is equal to $I_{sign}(\ell)$
then we are done. Otherwise, we have $i_s > I_{sign}(\ell)$ and we can use 3. to get $\widehat{S}_{i_g} > T \cdot \widehat{S}_{I_{sign}(\ell)} > T \cdot \widehat{S}_{i_s}$.

□

Now we have the necessary notation and properties of good and significant ranks to prove our
main theorem.

Theorem 1. Let $G = (V, E, w)$ be a weighted graph where the weights are from $[1, W]$. Let $A$ be an
algorithm that returns a $\lambda$-estimator $\widehat{M}$ for the size of a maximum matching $M$ of a graph with
$\frac{1}{\lambda} \cdot |M| \le \widehat{M} \le |M|$ with failure probability at most $\delta$ and needs space $S$. If we partition the edge
set into sets $E_0, \ldots, E_t$ with $t = \lfloor \log W \rfloor$ where $E_i$ consists of all edges with weight in $[2^i, 2^{i+1})$, set
$r_i = 2^i$, and use $A$ as the unweighted matching estimator in Algorithm 2, then there are parameters
$T$ and $c$ depending on $\lambda$ such that the algorithm returns an $O(\lambda^4)$-estimator $\widehat{W}$ for the weight of
the maximum weighted matching with failure probability at most $\delta \cdot (t + 1)$ using $O(S \cdot t)$ space,
i.e. there is a constant $c$ such that $\frac{1}{c \lambda^4} \cdot w(M^*) \le \widehat{W} \le w(M^*)$, where $M^*$ is an optimal weighted
matching.

Proof. In the following we condition on the event that all calls to the unweighted estimation routine
succeed, which happens with probability at least $1 - \delta \cdot (t + 1)$. The estimator returned by Algorithm
2 can be written as $\sum_{\ell=1}^{|I_{sign}|} r_{I_{sign}(\ell)} \cdot \widehat{R}_{I_{sign}(\ell)}$. Using similar arguments as found in Lemma 4 of [32]
(equivalently, Lemma 1 with $\varepsilon = 1$, which gives the factor $\frac{1}{2(1+1)^2} = \frac{1}{8}$),
we have $\frac{1}{8} \cdot w(M^*) \le \sum_{i=0}^{t} r_i |M_i| \le w(M^*)$. Thus, it is sufficient to show that $\sum_{\ell=1}^{|I_{sign}|} r_{I_{sign}(\ell)} \cdot \widehat{R}_{I_{sign}(\ell)}$
is a good estimator for $\sum_{i=0}^{t} r_i |M_i|$. We first consider the problem of estimating $D_\ell$, and then how to
charge the matching sizes.


(1) Estimation of $D_\ell$

Since $\bigcup_{j=i}^{t} M_j$ is a maximal matching in $\bigcup_{j=i}^{t} E_j$, $\widehat{S}_i$ is a good estimator for $S_i$.

Lemma 3. For all $i \in \{0, \ldots, t\}$ we have $\frac{1}{\lambda} \cdot S_i \le \widehat{S}_i \le 2 \cdot S_i$.

Proof. Let $F_j$ be the set of unmatched nodes after the iteration $j$ of Algorithm 1. Let $M^*$ be
a maximum matching in $(V, \bigcup_{j=i}^{t} E_j)$. $M_j$ is a maximal matching of $(V, E_j(F_j))$ and therefore
$\bigcup_{j=i}^{t} M_j$ is a maximal matching of $(V, \bigcup_{j=i}^{t} E_j)$. This allows us to apply the bounds of the $\lambda$-approximate
estimation algorithm:

$$\frac{1}{\lambda} \cdot S_i = \frac{1}{\lambda} \sum_{j=i}^{t} |M_j| \le \frac{1}{\lambda} \cdot |M^*| \le \widehat{S}_i \le |M^*| \le 2 \cdot \sum_{j=i}^{t} |M_j| = 2 \cdot S_i.$$

□


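Next, we show that for an index $i_g \in I_{good}$ the difference $\widehat{S}_{i_g} - \widehat{S}_{I_{sign}(\ell)}$ to the last significant rank
is a good estimator for $\sum_{i=i_g}^{I_{sign}(\ell)-1} |M_i|$.

Lemma 4. For all $i_g \in I_{good}$ with $I_{sign}(\ell + 1) \le i_g < I_{sign}(\ell)$ for some $\ell \in \{1, \ldots, |I_{sign}|\}$ and
$T = 8\lambda^2 - 2\lambda$,

$$\frac{1}{2\lambda} \cdot \sum_{i=i_g}^{I_{sign}(\ell)-1} |M_i| < \widehat{S}_{i_g} - \widehat{S}_{I_{sign}(\ell)} \le \frac{5}{2} \cdot \sum_{i=i_g}^{I_{sign}(\ell)-1} |M_i|$$

and $\frac{1}{\lambda} |M_t| \le \widehat{S}_t \le 2 |M_t|$.

Proof. For all $i_g \in I_{good}$ with $I_{sign}(\ell + 1) \le i_g < I_{sign}(\ell)$ we have

$$\sum_{i=i_g}^{I_{sign}(\ell)-1} |M_i| = S_{i_g} - S_{I_{sign}(\ell)} \overset{\text{Lem. 3}}{\ge} \frac{1}{2} \cdot \widehat{S}_{i_g} - \lambda \cdot \widehat{S}_{I_{sign}(\ell)} \overset{\text{Lem. 2 (2)}}{>} \left( \frac{T}{2} - \lambda \right) \cdot \widehat{S}_{I_{sign}(\ell)} \overset{\text{Lem. 3}}{\ge} \left( \frac{T}{2} - \lambda \right) \cdot \frac{1}{\lambda} \cdot S_{I_{sign}(\ell)} = \frac{T - 2\lambda}{2\lambda} \cdot S_{I_{sign}(\ell)}. \tag{1}$$

Setting $T = 8\lambda^2 - 2\lambda$, we then obtain the following upper and lower bounds:

$$\widehat{S}_{i_g} - \widehat{S}_{I_{sign}(\ell)} \ge \frac{1}{\lambda} \cdot S_{i_g} - 2 \cdot S_{I_{sign}(\ell)} = \frac{1}{\lambda} \sum_{i=i_g}^{I_{sign}(\ell)-1} |M_i| - \left( 2 - \frac{1}{\lambda} \right) \cdot S_{I_{sign}(\ell)}$$

$$\overset{(1)}{>} \frac{1}{\lambda} \sum_{i=i_g}^{I_{sign}(\ell)-1} |M_i| - \left( 2 - \frac{1}{\lambda} \right) \cdot \frac{2\lambda}{T - 2\lambda} \sum_{i=i_g}^{I_{sign}(\ell)-1} |M_i| = \left( \frac{1}{\lambda} - \frac{2\lambda - 1}{\lambda} \cdot \frac{2\lambda}{8\lambda^2 - 4\lambda} \right) \sum_{i=i_g}^{I_{sign}(\ell)-1} |M_i|$$

$$= \left( \frac{1}{\lambda} - \frac{2}{4\lambda} \right) \sum_{i=i_g}^{I_{sign}(\ell)-1} |M_i| = \frac{1}{2\lambda} \sum_{i=i_g}^{I_{sign}(\ell)-1} |M_i|$$

and

$$\widehat{S}_{i_g} - \widehat{S}_{I_{sign}(\ell)} \le 2 \cdot S_{i_g} - \frac{1}{\lambda} \cdot S_{I_{sign}(\ell)} = 2 \cdot \sum_{i=i_g}^{I_{sign}(\ell)-1} |M_i| + \left( 2 - \frac{1}{\lambda} \right) \cdot S_{I_{sign}(\ell)}$$

$$\overset{(1)}{<} 2 \sum_{i=i_g}^{I_{sign}(\ell)-1} |M_i| + \left( 2 - \frac{1}{\lambda} \right) \cdot \frac{2\lambda}{8\lambda^2 - 4\lambda} \sum_{i=i_g}^{I_{sign}(\ell)-1} |M_i| = \left( 2 + \frac{1}{2\lambda} \right) \sum_{i=i_g}^{I_{sign}(\ell)-1} |M_i| \le \frac{5}{2} \sum_{i=i_g}^{I_{sign}(\ell)-1} |M_i|,$$

where we used $\lambda \ge 1$ in the last inequality. Since $|M_t| = S_t$ and $\widehat{R}_t = \widehat{S}_t$, the last statement follows
directly from Lemma 3. □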

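From Lemma 2 (1) we know that $I_{sign} \subseteq I_{good}$, which together with Lemma 4 implies that
$\widehat{R}_{I_{sign}(\ell)}$ is a good estimator for $D_\ell$.

Corollary 1. For $\ell \in \{1, \ldots, |I_{sign}|\}$, $\frac{1}{2\lambda} \cdot D_\ell \le \widehat{R}_{I_{sign}(\ell)} \le \frac{5}{2} \cdot D_\ell$. Furthermore, if $c > 5\lambda$ then the
values of the $D_\ell$ are exponentially increasing:

$$D_1 \le \frac{5\lambda}{c} D_2 \le \ldots \le \left( \frac{5\lambda}{c} \right)^{|I_{sign}| - 1} D_{|I_{sign}|}.$$

Proof. Recall that for $\ell \in \{2, \ldots, |I_{sign}|\}$ we defined $D_\ell = \sum_{i=I_{sign}(\ell)}^{I_{sign}(\ell-1)-1} |M_i|$. For $\ell = 1$ the value
of $\widehat{R}_{I_{sign}(1)} = \widehat{S}_t$ is a good estimator for the size of the matching $M_t$ (which is equal to $D_1$) due to
Lemma 3. Since for $\ell \in \{2, \ldots, |I_{sign}|\}$ it is $\widehat{R}_{I_{sign}(\ell)} = \widehat{S}_{I_{sign}(\ell)} - \widehat{S}_{I_{sign}(\ell-1)}$ and $I_{sign} \subseteq I_{good}$, the
first statement is a direct implication of Lemma 4 by setting $i_g = I_{sign}(\ell)$.

For three adjoining significant ranks $I_{sign}(\ell + 1)$, $I_{sign}(\ell)$, $I_{sign}(\ell - 1)$ with $\ell \in \{2, \ldots, |I_{sign}| - 1\}$,
we have

$$\frac{1}{2\lambda} \cdot D_\ell = \frac{1}{2\lambda} \sum_{i=I_{sign}(\ell)}^{I_{sign}(\ell-1)-1} |M_i| \overset{\text{Lem. 4}}{\le} \widehat{S}_{I_{sign}(\ell)} - \widehat{S}_{I_{sign}(\ell-1)} = \widehat{R}_{I_{sign}(\ell)}$$

$$\overset{\text{Lem. 2 (4)}}{<} \frac{1}{c} \cdot \widehat{R}_{I_{sign}(\ell+1)} = \frac{1}{c} \cdot \left( \widehat{S}_{I_{sign}(\ell+1)} - \widehat{S}_{I_{sign}(\ell)} \right) \overset{\text{Lem. 4}}{\le} \frac{5}{2c} \sum_{i=I_{sign}(\ell+1)}^{I_{sign}(\ell)-1} |M_i| = \frac{5}{2c} \cdot D_{\ell+1}.$$

Since $D_1 = |M_t|$ and $\widehat{R}_{I_{sign}(1)} = \widehat{R}_t = \widehat{S}_t$, Lemma 4 also implies that

$$\frac{1}{2\lambda} \cdot D_1 \le \frac{1}{\lambda} \cdot D_1 \le \widehat{R}_{I_{sign}(1)} < \frac{1}{c} \cdot \widehat{R}_{I_{sign}(2)} \le \frac{5}{2c} \sum_{i=I_{sign}(2)}^{I_{sign}(1)-1} |M_i| = \frac{5}{2c} \cdot D_2.$$

Thus, for $c > 5\lambda$ the values of the $D_\ell$ are exponentially increasing:

$$D_1 \le \frac{5\lambda}{c} D_2 \le \ldots \le \left( \frac{5\lambda}{c} \right)^{|I_{sign}| - 1} D_{|I_{sign}|}.$$

□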


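(2) The Charging Argument

We show that the sum of the matching sizes between two significant ranks $I_{sign}(\ell + 1)$ and $I_{sign}(\ell)$
is bounded by $O(\lambda \cdot T \cdot D_\ell) = O\left( \lambda \cdot T \cdot \sum_{i=I_{sign}(\ell)}^{I_{sign}(\ell-1)-1} |M_i| \right)$.

Lemma 5. Set $c = \frac{2}{5} \cdot T + 5\lambda$ in Algorithm 2. Then for $\ell \in \{1, \ldots, |I_{sign}| - 1\}$,
$\sum_{i=I_{sign}(\ell+1)+1}^{I_{sign}(\ell)-1} |M_i| \le (2\lambda \cdot T + 25\lambda^2) \cdot D_\ell$, and
$\sum_{i=0}^{I_{sign}(|I_{sign}|)-1} |M_i| \le (2\lambda \cdot T + 25\lambda^2) \cdot D_{|I_{sign}|}$ if $0 \notin I_{sign}$.

Proof. For the proof of the first inequality, let $i_g \in I_{good}$ be minimal such that $I_{sign}(\ell + 1) < i_g <
I_{sign}(\ell)$ for $\ell \in \{1, \ldots, |I_{sign}| - 1\}$. If such a good rank does not exist, set $i_g = -1$. We distinguish
between two cases. Note that $c = \frac{2}{5} \cdot T + 5\lambda > 5\lambda$.

Case 1: $i_g = I_{sign}(\ell + 1) + 1$. For the sake of simplicity, we abuse the notation and set $\widehat{S}_{I_{sign}(0)} = 0$
such that $\widehat{R}_{I_{sign}(\ell)} = \widehat{S}_{I_{sign}(\ell)} - \widehat{S}_{I_{sign}(\ell-1)}$ also holds for $\ell = 1$. Since $i_g$ is good but not significant,
the second test of Algorithm 2 failed for it, i.e. $\widehat{S}_{i_g} - \widehat{S}_{I_{sign}(\ell)} \le c \cdot \widehat{R}_{I_{sign}(\ell)}$. Using Lemma 4 we have

$$\sum_{i=I_{sign}(\ell+1)+1}^{I_{sign}(\ell)-1} |M_i| = \sum_{i=i_g}^{I_{sign}(\ell)-1} |M_i| \overset{\text{Lem. 4}}{\le} 2\lambda \cdot \left( \widehat{S}_{i_g} - \widehat{S}_{I_{sign}(\ell)} \right) \le 2\lambda c \cdot \widehat{R}_{I_{sign}(\ell)} = 2\lambda \cdot c \cdot \left( \widehat{S}_{I_{sign}(\ell)} - \widehat{S}_{I_{sign}(\ell-1)} \right)$$

$$\overset{\text{Lem. 4}}{\le} 5\lambda \cdot c \cdot \sum_{i=I_{sign}(\ell)}^{I_{sign}(\ell-1)-1} |M_i| = 5\lambda \cdot c \cdot D_\ell. \tag{2}$$

Case 2: $i_g \neq I_{sign}(\ell + 1) + 1$. In this case $\widehat{S}_{I_{sign}(\ell+1)+1} \le T \cdot \widehat{S}_{I_{sign}(\ell)}$. Thus

$$\sum_{i=I_{sign}(\ell+1)+1}^{I_{sign}(\ell)-1} |M_i| \le S_{I_{sign}(\ell+1)+1} \overset{\text{Lem. 3}}{\le} \lambda \cdot \widehat{S}_{I_{sign}(\ell+1)+1} \le \lambda \cdot T \cdot \widehat{S}_{I_{sign}(\ell)} \overset{\text{Lem. 3}}{\le} 2\lambda \cdot T \cdot S_{I_{sign}(\ell)} = 2\lambda \cdot T \cdot \sum_{i=1}^{\ell} D_i$$

$$\overset{\text{Cor. 1}}{\le} 2\lambda \cdot T \cdot D_\ell \cdot \sum_{i=0}^{\ell-1} \left( \frac{5\lambda}{c} \right)^i \le 2\lambda \cdot T \cdot D_\ell \cdot \frac{1}{1 - \frac{5\lambda}{c}}. \tag{3}$$

Combining the inequalities (2) and (3) we have
$\sum_{i=I_{sign}(\ell+1)+1}^{I_{sign}(\ell)-1} |M_i| \le \max\left\{ 5\lambda \cdot c, \frac{2\lambda \cdot T}{1 - 5\lambda / c} \right\} \cdot D_\ell$, which simplifies to

$$\sum_{i=I_{sign}(\ell+1)+1}^{I_{sign}(\ell)-1} |M_i| \le (2\lambda \cdot T + 25\lambda^2) \cdot D_\ell \quad \text{for } \ell \in \{1, \ldots, |I_{sign}| - 1\}.$$

If $0 \notin I_{sign}$ we can use the same arguments to bound $\sum_{i=0}^{I_{sign}(|I_{sign}|)-1} |M_i|$ by $(2\lambda \cdot T + 25\lambda^2) \cdot D_{|I_{sign}|}$.
Let $i_g \in I_{good}$ be minimal such that $0 \le i_g < I_{sign}(|I_{sign}|)$. Again, we distinguish between two
cases.

Case 1: $i_g = 0$. Using Lemma 4 we have

$$\frac{1}{2\lambda} \cdot \sum_{i=0}^{I_{sign}(|I_{sign}|)-1} |M_i| \overset{\text{Lem. 4}}{\le} \widehat{S}_0 - \widehat{S}_{I_{sign}(|I_{sign}|)} \le c \cdot \widehat{R}_{I_{sign}(|I_{sign}|)} = c \cdot \left( \widehat{S}_{I_{sign}(|I_{sign}|)} - \widehat{S}_{I_{sign}(|I_{sign}|-1)} \right)$$

$$\overset{\text{Lem. 4}}{\le} \frac{5}{2} \cdot c \cdot \sum_{i=I_{sign}(|I_{sign}|)}^{I_{sign}(|I_{sign}|-1)-1} |M_i| = \frac{5}{2} \cdot c \cdot D_{|I_{sign}|}$$

$$\Rightarrow \sum_{i=0}^{I_{sign}(|I_{sign}|)-1} |M_i| \le 5\lambda \cdot c \cdot D_{|I_{sign}|}.$$

Case 2: $i_g \neq 0$. In this case $\widehat{S}_0 \le T \cdot \widehat{S}_{I_{sign}(|I_{sign}|)}$. Thus

$$\frac{1}{\lambda} \cdot \sum_{i=0}^{I_{sign}(|I_{sign}|)-1} |M_i| \le \frac{1}{\lambda} \cdot S_0 \overset{\text{Lem. 3}}{\le} \widehat{S}_0 \le T \cdot \widehat{S}_{I_{sign}(|I_{sign}|)}$$

$$\Rightarrow \sum_{i=0}^{I_{sign}(|I_{sign}|)-1} |M_i| \le \lambda \cdot T \cdot \widehat{S}_{I_{sign}(|I_{sign}|)} \overset{\text{Lem. 3}}{\le} 2\lambda \cdot T \cdot S_{I_{sign}(|I_{sign}|)} = 2\lambda \cdot T \cdot \sum_{i=1}^{|I_{sign}|} D_i$$

$$\overset{\text{Cor. 1}}{\le} 2\lambda \cdot T \cdot D_{|I_{sign}|} \cdot \sum_{i=0}^{|I_{sign}|-1} \left( \frac{5\lambda}{c} \right)^i \le \frac{2\lambda \cdot T}{1 - \frac{5\lambda}{c}} \cdot D_{|I_{sign}|}.$$

Now, with the same $c = \frac{2}{5} \cdot T + 5\lambda$ as before we have

$$\sum_{i=0}^{I_{sign}(|I_{sign}|)-1} |M_i| \le (2\lambda \cdot T + 25\lambda^2) \cdot D_{|I_{sign}|}.$$

□

We use Lemma 5 to show that $w(M)$ is bounded in terms of $\sum_{\ell=1}^{|I_{sign}|} r_{I_{sign}(\ell)} \cdot D_\ell$:

$$\sum_{i=0}^{t} r_i \cdot |M_i| \ge \sum_{\ell=1}^{|I_{sign}|} r_{I_{sign}(\ell)} \cdot D_\ell \tag{4}$$

$$\sum_{i=0}^{t} r_i \cdot |M_i| \le (1 + 2\lambda \cdot T + 25\lambda^2) \cdot \sum_{\ell=1}^{|I_{sign}|} r_{I_{sign}(\ell)} \cdot D_\ell. \tag{5}$$

Putting Everything Together

Using Corollary 1 we have $\frac{1}{2\lambda} \cdot D_\ell \le \widehat{R}_{I_{sign}(\ell)} \le \frac{5}{2} \cdot D_\ell$ for all $\ell \in \{1, \ldots, |I_{sign}|\}$, which with (4)
and (5) gives $\frac{1}{2\lambda (1 + 2\lambda T + 25\lambda^2)} \cdot w(M) \le \sum_{\ell=1}^{|I_{sign}|} r_{I_{sign}(\ell)} \cdot \widehat{R}_{I_{sign}(\ell)} \le \frac{5}{2} \cdot w(M)$. Recall that we set
$T = 8\lambda^2 - 2\lambda$. Now, folding in the factor of $\frac{1}{8}$ from the partitioning and rescaling the estimator
gives an $O(\lambda^4)$-estimation on the weight of an optimal weighted matching.

□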
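
The parameter choices in the proof can be checked with exact arithmetic. The sketch below assumes $T = 8\lambda^2 - 2\lambda$ and $c = \frac{2}{5}T + 5\lambda$, the choice for which both case bounds of the charging argument equal $2\lambda T + 25\lambda^2$; the function name and the returned tuple layout are illustrative.

```python
from fractions import Fraction


def constants(lam):
    """Exact values of the analysis constants for a given lambda >= 1."""
    lam = Fraction(lam)
    T = 8 * lam ** 2 - 2 * lam
    c = Fraction(2, 5) * T + 5 * lam
    case1 = 5 * lam * c                        # bound from Case 1
    case2 = 2 * lam * T / (1 - 5 * lam / c)    # bound from Case 2
    charge = 2 * lam * T + 25 * lam ** 2       # claimed common bound
    lower = 1 / (2 * lam * (1 + charge))       # factor in front of w(M)
    return T, c, case1, case2, charge, lower


T, c, case1, case2, charge, lower = constants(2)
print(T, c, case1, case2, charge, lower)
```

For $\lambda = 2$ this gives $T = 28$, $c = 106/5$ and a common charging bound of 212, so the estimator is within a factor $2\lambda(1 + 212) = 852$ of $w(M)$ from below before the partitioning loss is folded in; the leading term $32\lambda^4$ of $2\lambda(1 + 2\lambda T + 25\lambda^2)$ is where the $O(\lambda^4)$ comes from.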


2.1 Applications 

Since every edge insertion and deletion supplies the edge weight, it is straightforward to determine 
the rank for each edge upon every update. Using the following results for unweighted matching, 
we can obtain estimates with similar approximation guarantee and space bounds for weighted 
matching. 
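
How updates are routed can be sketched as follows. Since Algorithm 2 queries the estimator on the suffix graphs $(V, \bigcup_{j \ge i} E_j)$, an update of rank $r$ is forwarded to the structures for $i = 0, \ldots, r$. In this illustration each structure is a plain edge multiset standing in for a linear sketch; the class name `RankRouter` and the assumption that weights lie in $[1, W]$ with $W$ known in advance are ours.

```python
from collections import defaultdict


class RankRouter:
    """Routes dynamic-stream updates to one edge multiset per suffix graph."""

    def __init__(self, max_weight):
        self.t = int(max_weight).bit_length() - 1   # largest rank, floor(log2 W)
        # suffix[i] stands in for the sketch of (V, E_i ∪ ... ∪ E_t)
        self.suffix = [defaultdict(int) for _ in range(self.t + 1)]

    def update(self, u, v, w, delta):
        """delta = +1 inserts and delta = -1 deletes edge {u, v} of weight w."""
        r = int(w).bit_length() - 1                 # rank with 2^r <= w < 2^(r+1)
        key = (min(u, v), max(u, v))
        for i in range(r + 1):
            self.suffix[i][key] += delta
            if self.suffix[i][key] == 0:
                del self.suffix[i][key]

    def edges(self, i):
        return sorted(self.suffix[i])


router = RankRouter(16)
router.update(0, 1, 5, +1)     # rank 2
router.update(2, 3, 1, +1)     # rank 0
router.update(0, 1, 5, -1)     # weight change: delete ...
router.update(0, 1, 9, +1)     # ... and reinsert with the new weight (rank 3)
print(router.edges(0), router.edges(3))
```

The delete-then-reinsert pattern mirrors the consistency assumption for dynamic graph streams: a weight is never modified in place.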


Random Order Streams 

For an arbitrary graph whose edges are streamed in random order, Kapralov, Khanna and Sudan
[23] gave an algorithm with polylog $n$ approximation guarantee using polylog $n$ space with
failure probability $\delta = 1/\mathrm{polylog}\, n$. Since this probability takes the randomness of the input permutation
into account, we cannot easily amplify it, though as long as $\delta \cdot (\lfloor \log W \rfloor + 1)$ is bounded by a
constant smaller than 1 (see Theorem 1), the extension to weighted
matching still succeeds with at least constant probability.


Adversarial Streams 


The arboricity of a graph $G$ is defined as $\max_{U \subseteq V} \left\lceil \frac{|E(U)|}{|U| - 1} \right\rceil$. Examples of graphs with constant arboricity
include planar graphs and graphs with constant degree. For graphs of bounded arboricity $\nu$,
Esfandiari et al. [15] gave an algorithm with an $O(\nu)$ approximation guarantee using $\tilde{O}(\nu \cdot n^{2/3})$
space.


Dynamic Streams 

We give two estimation algorithms for the size of a maximum matching. First, we see that it is easy
to estimate the matching size in trees. Second, we extend the result from [15], where the matching
size of so-called bounded arboricity graphs is estimated in insertion-only streams, to dynamic graph
streams.


Matching Size of Trees Let $T = (V, E)$ be a tree with at least 3 nodes and let $h_T$ be the number
of internal nodes, i.e. nodes with degree greater than 1. We know that the size of a maximum
matching is between $h_T/2$ and $h_T$. Therefore, it suffices to estimate the number of internal nodes
of a tree to approximate the maximum matching within a $2 + \varepsilon$ factor, which was also observed in
[15]. In order to estimate the matching size, we maintain an $\ell_0$-estimator for the degree vector
$d \in \mathbb{R}^N$ such that $d_v = \deg(v) - 1$ holds at the end of the stream and with it $\ell_0(d) = h_T$. In
other words, we initialize the vector by adding $-1$ to each entry and update the two corresponding
entries when we get an edge deletion or insertion. Since the number of edges in a tree is $N - 1$, the
preprocessing time can be amortized during the stream. Using Theorem 10 from Kane et al. [21],
we can maintain the $\ell_0$-estimator for $d$ in $O(\varepsilon^{-2} \log^2 N)$ space.

Theorem 3. Let $T = (V, E)$ be a tree with at least 3 nodes and let $\varepsilon \in (0, 1)$. Then there is
an algorithm that estimates the size of a maximum matching in $T$ within a $(2 + \varepsilon)$-factor in the
dynamic streaming model using 1 pass over the data and $O(\varepsilon^{-2} \log^2 N)$ space.

As in [15], this algorithm can be extended to forests with no isolated node.
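The bookkeeping behind Theorem 3 can be illustrated with a minimal Python sketch. It is not the small-space algorithm: an exact dictionary stands in for the $\ell_0$-estimator of Kane et al., and the class and method names are hypothetical, chosen only for this illustration. It shows how the shifted degree vector $d_v = \deg(v) - 1$ is updated under insertions and deletions and how $\ell_0(d)$ yields the interval $[h_T/2, h_T]$ for the matching size.

```python
class TreeMatchingEstimator:
    """Tracks d_v = deg(v) - 1 under edge insertions and deletions.

    Assumption for illustration: an exact dictionary replaces the
    l0-sketch, so this uses linear rather than polylogarithmic space.
    """

    def __init__(self, nodes):
        # Initialise every entry with -1, as described in the text.
        self.d = {v: -1 for v in nodes}

    def update(self, u, v, delta):
        # delta = +1 for an insertion, -1 for a deletion of edge {u, v}.
        self.d[u] += delta
        self.d[v] += delta

    def internal_nodes(self):
        # l0(d): number of non-zero entries, i.e. nodes with degree > 1
        # (valid at the end of the stream when no node is isolated).
        return sum(1 for x in self.d.values() if x != 0)

    def matching_bounds(self):
        # The maximum matching size lies between h_T / 2 and h_T.
        h = self.internal_nodes()
        return h / 2, h


# Path 0-1-2-3 with one edge inserted and deleted again.
est = TreeMatchingEstimator(range(4))
for u, v, delta in [(0, 1, 1), (1, 2, 1), (0, 3, 1), (2, 3, 1), (0, 3, -1)]:
    est.update(u, v, delta)
print(est.matching_bounds())
```

For the path on four nodes the two inner nodes are internal, so the sketch reports the interval from 1 to 2, which contains the true matching size 2.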


Matching Size in Graphs with Bounded Arboricity The algorithm is based on the results from
[15]. Since we need parametrized versions of their results, we summarize and rephrase the ideas
and proofs in this section. Let $G = (V, E)$ be a graph. The arboricity $a(G)$ of $G$ is a kind of density
measure: the number of edges in every induced subgraph of size $s$ in $G$ is bounded by $s \cdot a(G)$.
Formally, the arboricity $a(G)$ of $G$ is defined by $a(G) = \max_{U \subseteq V} \left\lceil \frac{|E(U)|}{|U| - 1} \right\rceil$. If $\mu_G$ is an upper bound on
the average degree of every induced subgraph of $G$ then $\mu_G \le 2 \cdot a(G)$.


Definition 2 ([15]). A node $v \in V$ is light if $\deg(v) \le C$ with $C = \lceil \mu_G \rceil + 3$. Otherwise, $v$ is heavy.
An edge is shallow if and only if both of its endpoints are light. We denote by $h_G$ the number of
heavy nodes in $G$ and by $s_G$ the number of shallow edges in $G$, respectively.

Using the results from Czygrinow, Hańćkowiak, and Szymańska [12] (and $C = 20\, a(G)/\varepsilon^2$) it is
possible to get an $O(a(G))$ approximation for the size of a maximum matching by just estimating
$h_G$ and $s_G$. Esfandiari et al. [15] improved the approximation factor to roughly $5 \cdot a(G)$.

Lemma 6 ([15]). Let $G = (V, E)$ be a graph with maximum matching $M^*$. Then we have
$\frac{\max\{h_G, s_G\}}{\eta} \le |M^*| \le h_G + s_G$ where $\eta = 1.25C + 0.75$ and $C$ is at most $\lceil 2a(G) + 3 \rceil$.
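To make Definition 2 and the upper bound of Lemma 6 concrete, the following Python sketch counts heavy nodes and shallow edges of a small static graph and compares $h_G + s_G$ with a brute-force maximum matching. The function names and the threshold $C = 3$ used in the example are illustrative assumptions; the brute-force search is only meant for tiny inputs.

```python
from itertools import combinations


def heavy_and_shallow(edges, C):
    """Return (h_G, s_G): number of heavy nodes and of shallow edges."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    # A node is light if deg(v) <= C and heavy otherwise.
    heavy = {v for v, d in deg.items() if d > C}
    # An edge is shallow iff both endpoints are light.
    shallow = sum(1 for u, v in edges if u not in heavy and v not in heavy)
    return len(heavy), shallow


def max_matching_size(edges):
    """Brute-force maximum matching size (exponential time, tiny graphs only)."""
    for k in range(len(edges), 0, -1):
        for subset in combinations(edges, k):
            endpoints = [x for e in subset for x in e]
            if len(endpoints) == len(set(endpoints)):
                return k
    return 0


# Star with centre 0 and five leaves, plus one extra edge {5, 6}.
edges = [(0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (5, 6)]
h, s = heavy_and_shallow(edges, C=3)
print(h, s, max_matching_size(edges))
```

In this example only the centre is heavy and only the edge {5, 6} is shallow, so $h_G + s_G = 2$, which matches the maximum matching size and is consistent with $|M^*| \le h_G + s_G$.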

Estimating $h_G$ and $s_G$ is possible by random sampling: For heavy nodes, we randomly draw
a large enough set of nodes and count the heavy nodes by maintaining their degree. Rescaling
the counter gives a sufficiently good estimate, provided $h_G$ is large enough. For $s_G$ we randomly
draw nodes and maintain the induced subgraph. For each node contained in the subgraph it is
straightforward to maintain the degree and thereby to decide whether or not a given edge from the
subgraph is shallow. Then we can rescale the counted number of shallow edges, which gives us an
estimation of $s_G$ if $s_G$ is large enough. To deal with small values of $s_G$ and $h_G$, Esfandiari et al.
additionally maintain a small maximal matching of size at most $n^\alpha$ with $\alpha < 1$. If the maintained
matching exceeds this value then we know that either $s_G$ or $h_G$ is greater than $n^\alpha/2$ by Lemma 6
and the estimation of the parameters $h_G$ and $s_G$ will be sufficiently accurate. The main tool
to extend this algorithm to dynamic graph streams is to estimate the size of a small matching by
means of the Tutte matrix. But first, we restate the following three lemmas from [15] for arbitrary
parameters and extend them to dynamic streams.

Lemma 7. Let $T$ be an integer and $\varepsilon \le 1/\sqrt{3}$. Then there exists a 1-pass algorithm for dynamic
streams that outputs a value $\hat{h}$ which is a $(1 \pm \varepsilon)$ estimation of $h_G$ if $h_G \ge T$ and which is smaller
than $3T$ otherwise. The algorithm needs $O\left(\frac{n \log^2 n}{\varepsilon^2 T}\right)$ space and succeeds with high probability.

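Proof. The probability of sampling a heavy node is $\frac{h_G}{n}$. Hence, sampling a set of nodes $S$ gives us
$|S| \cdot \frac{h_G}{n}$ heavy nodes in expectation. Set $|S| = \frac{3 \log n}{\varepsilon^2} \cdot \frac{n}{T}$. For each node $v \in S$ we maintain its degree
using $O(\log n)$ space. We define the indicator variable $X_v$ with $v \in S$ which is 1 if $v$ is heavy and
0 otherwise. Then our estimator for $h_G$ is $\hat{h} = \frac{n}{|S|} \sum_{v \in S} X_v$, which is equal to $h_G$ in expectation. First,
assume $h_G \ge T$. Then using the Chernoff bound we have

\[
P\left[\hat{h} \ge (1 + \varepsilon) \cdot E[\hat{h}]\right]
= P\left[\sum_{v \in S} X_v \ge (1 + \varepsilon) \cdot E\left[\sum_{v \in S} X_v\right]\right]
\le \exp\left(-\frac{3 \log n}{\varepsilon^2} \cdot \frac{n}{T} \cdot \frac{h_G}{n} \cdot \frac{\varepsilon^2}{3}\right)
\le \frac{1}{n}.
\]

The same bound also holds for $P\left[\hat{h} \le (1 - \varepsilon) \cdot E[\hat{h}]\right]$. If $h_G < T$, then again using the Chernoff
bound gives us

\[
P\left[\hat{h} \ge 3T\right]
= P\left[\sum_{v \in S} X_v \ge \frac{3T \cdot |S| \cdot h_G}{n \cdot h_G}\right]
= P\left[\sum_{v \in S} X_v \ge \left(1 + \frac{3T}{h_G} - 1\right) \cdot E\left[\sum_{v \in S} X_v\right]\right]
\le \frac{1}{n},
\]

where the last inequality follows from $\frac{3T}{h_G} - 1 \ge \frac{2T}{h_G}$ and $\varepsilon \le \frac{1}{\sqrt{3}}$.

□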
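The sampling estimator analysed in Lemma 7 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the stream encoding as `(u, v, delta)` triples and the fixed seed are hypothetical, and the sample size is passed in directly instead of being set to $\frac{3 \log n}{\varepsilon^2} \cdot \frac{n}{T}$.

```python
import random


def estimate_heavy_nodes(n, stream, C, sample_size, seed=0):
    """Estimate h_G from a dynamic edge stream by node sampling.

    stream yields (u, v, delta) with delta = +1 (insert) or -1 (delete).
    Only the degrees of the sampled nodes are maintained.
    """
    rng = random.Random(seed)
    sample = rng.sample(range(n), sample_size)
    deg = {v: 0 for v in sample}
    for u, v, delta in stream:
        if u in deg:
            deg[u] += delta
        if v in deg:
            deg[v] += delta
    # X_v = 1 iff the sampled node v is heavy, i.e. deg(v) > C.
    heavy_in_sample = sum(1 for d in deg.values() if d > C)
    # Rescale by n / |S| so that the estimator equals h_G in expectation.
    return n * heavy_in_sample / sample_size


# Node 0 has degree 5; edge {6, 7} is inserted and later deleted.
stream = [(0, i, 1) for i in range(1, 6)] + [(6, 7, 1), (6, 7, -1)]
print(estimate_heavy_nodes(10, stream, C=3, sample_size=10))
```

With the sample equal to the whole node set the rescaling factor is 1, so the sketch returns the exact count; smaller samples trade accuracy for space as quantified in the lemma.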


Lemma 8. Let $T$ be an integer and $\varepsilon \le 1/\sqrt{3}$. Then there exists a 2-pass algorithm for dynamic
streams that outputs a value $\hat{s}$ which is a $(1 \pm \varepsilon)$ estimation of $s_G$ if $s_G \ge T$ and which is smaller
than $3T$ if $s_G < T$. The algorithm uses $O\left(\frac{a(G) \cdot n \log^4 n}{\varepsilon^2 T}\right)$ space and succeeds with high probability.


Proof. In the first pass, we sample $\frac{3 \log n}{\varepsilon^2} \cdot \frac{a(G) \cdot n}{T}$ edges uniformly at random using $\ell_0$-samplers, each
of which costs at most $O(\log^3 n)$ space [20]. For each node of a sampled edge, we maintain its degree
in the second pass to decide whether a given edge is shallow or not. Hereafter, we reapply the
analysis of Lemma 7: Let $S = (e_1, \ldots, e_{|S|})$ be the sequence of sampled edges in the first pass and
let $X_i$ be the indicator variable which is 1 if and only if $e_i$ is shallow. The probability of sampling
a shallow edge is $\frac{s_G}{|E|}$, which implies that $E\left[\sum_i X_i\right] = |S| \cdot \frac{s_G}{|E|} \ge |S| \cdot \frac{s_G}{a(G) \cdot n}$. Now, let $\hat{s} = \frac{|E|}{|S|} \sum_i X_i$
be our estimator. We know that $E[\hat{s}] = s_G$. If $s_G \ge T$ then by Chernoff we have

\[
P\left[\hat{s} \ge (1 + \varepsilon) \cdot E[\hat{s}]\right]
= P\left[\sum_i X_i \ge (1 + \varepsilon) \cdot E\left[\sum_i X_i\right]\right]
\le \exp\left(-\frac{3 \log n}{\varepsilon^2} \cdot \frac{a(G) \cdot n}{T} \cdot \frac{s_G}{a(G) \cdot n} \cdot \frac{\varepsilon^2}{3}\right)
\le \frac{1}{n}.
\]

The same bound also holds for $P\left[\hat{s} \le (1 - \varepsilon) \cdot E[\hat{s}]\right]$. If $s_G < T$, then again using the Chernoff
bound gives us

\[
P\left[\hat{s} \ge 3T\right]
= P\left[\sum_i X_i \ge \frac{3T \cdot |S| \cdot s_G}{|E| \cdot s_G}\right]
= P\left[\sum_i X_i \ge \left(1 + \frac{3T}{s_G} - 1\right) \cdot E\left[\sum_i X_i\right]\right]
\le \frac{1}{n},
\]

where the last inequality follows from $\frac{3T}{s_G} - 1 \ge \frac{2T}{s_G}$ and $\varepsilon \le \frac{1}{\sqrt{3}}$.

□

Lemma 9. Let $\varepsilon > 0$ and $T > (16C/\varepsilon)^2$ be an integer. Then there exists a 1-pass algorithm for
dynamic streams that outputs a value $\hat{s}$ which is a $(1 \pm \varepsilon)$ estimation of $s_G$ if $s_G \ge T$ and which
is smaller than $3T$ if $s_G < T$. The algorithm uses $\tilde{O}\left(\frac{a(G) \cdot n}{\varepsilon \sqrt{T}}\right)$ space and succeeds with constant
probability.

Proof. Let $S$ be a set of $\frac{4n}{\varepsilon \sqrt{T}}$ randomly chosen nodes. We maintain the entire subgraph induced
by $S$ and the degree of each node in $S$. Note that the number of edges in this subgraph at the end
of the stream is at most $a(G) \cdot |S|$. Since we have edge deletions this number may be exceeded at
some point during the stream. Thus, we cannot explicitly store the subgraph but we can recover
all entries using an $a(G) \cdot |S|$-sparse recovery sketch using $\tilde{O}(a(G) \cdot |S|)$ space (see Barkay et al.
[7]). Let $e_1, \ldots, e_{s_G}$ be the shallow edges in $G$. Define $X_i = 1$ if $e_i \in E(S)$ and 0 otherwise. $X_i$ is
Bernoulli distributed where the probability of both nodes being included in the subgraph follows
from the hypergeometric distribution with population $n$, 2 successes in the population, sample size
$|S|$ and 2 successes in the sample:

\[
p = \frac{\binom{2}{2} \binom{n-2}{|S|-2}}{\binom{n}{|S|}} = \frac{|S| \cdot (|S| - 1)}{n \cdot (n - 1)} \ge \frac{|S|^2}{2n^2} = \frac{8}{\varepsilon^2 T}.
\]

Since $X_i$ is Bernoulli distributed, we have $\mathrm{Var}[X_i] = p \cdot (1 - p) \le p$. We know that
$\mathrm{Var}\left[\sum_i X_i\right] = \sum_i \mathrm{Var}[X_i] + \sum_{i \ne j} \mathrm{Cov}[X_i, X_j]$. For the covariance between two variables $X_i$ and $X_j$ we have
two cases: If $e_i$ and $e_j$ do not share a node, then $X_i$ and $X_j$ cannot be positively correlated,
i.e. $\mathrm{Cov}[X_i, X_j] \le 0$. To be more precise, we observe that by definition $\mathrm{Cov}[X_i, X_j]$ is equal to
$E[X_i X_j] - E[X_i] \cdot E[X_j]$, which is equal to $P[X_i = X_j = 1] - p^2$. The probability $P[X_i = X_j = 1]$
is equal to the probability of drawing exactly 4 fixed nodes from $V$ with a sample of size $|S|$, which
is

\[
\frac{\binom{4}{4} \binom{n-4}{|S|-4}}{\binom{n}{|S|}} = \frac{|S| \cdot (|S| - 1) \cdot (|S| - 2) \cdot (|S| - 3)}{n \cdot (n - 1) \cdot (n - 2) \cdot (n - 3)}.
\]

Since $\frac{a - c}{b - c} \le \frac{a}{b}$ for $a \le b$ and $0 \le c < b$, this probability is at most $p^2$, which means that the covariance
is at most 0. If $e_i$ and $e_j$ share a node, we have

\[
\mathrm{Cov}[X_i, X_j] \le P[X_i = X_j = 1] = \frac{\binom{3}{3} \binom{n-3}{|S|-3}}{\binom{n}{|S|}} = \frac{|S| \cdot (|S| - 1) \cdot (|S| - 2)}{n \cdot (n - 1) \cdot (n - 2)} \le p^{3/2}.
\]

By definition each node incident to a shallow edge has at most $C$ neighbors and therefore, we have
at most $2C$ edges that share a node with a given shallow edge. In total, we can bound the variance
of $X = \sum_i X_i$:

\[
\mathrm{Var}[X] = \sum_i \mathrm{Var}[X_i] + \sum_{i \ne j} \mathrm{Cov}[X_i, X_j]
\le p \cdot s_G + \sum_i \sum_{\substack{j \\ e_i, e_j \text{ share a node}}} \mathrm{Cov}[X_i, X_j]
\le p \cdot s_G + 2C \cdot s_G \cdot p^{3/2} \le 2p \cdot s_G,
\]

where the last inequality follows from $\sqrt{p} \le \frac{|S|}{n} = \frac{4}{\varepsilon \sqrt{T}}$ and $T > (16C/\varepsilon)^2$. Using Chebyshev's
inequality we have for $s_G \ge T$

\[
P\left[\left|\frac{1}{p} X - \frac{1}{p} E[X]\right| > \varepsilon \cdot \frac{1}{p} E[X]\right]
= P\left[|X - E[X]| > \varepsilon \cdot E[X]\right]
\le \frac{\mathrm{Var}[X]}{\varepsilon^2 E[X]^2}
\le \frac{2p \cdot s_G}{\varepsilon^2 p^2 \cdot s_G^2}
\le \frac{2}{\varepsilon^2 T p}
\le \frac{2 \varepsilon^2 T}{8 \varepsilon^2 T} = \frac{1}{4}.
\]

If $s_G < T$, we have $E[X] = p \cdot s_G < pT$. Thus, it is

\[
P\left[\frac{1}{p} \cdot X \ge 3T\right]
= P\left[X - E[X] \ge 3Tp - E[X]\right]
\le P\left[|X - E[X]| > 2Tp\right]
\le \frac{\mathrm{Var}[X]}{4T^2 p^2}
\le \frac{2p \cdot s_G}{4T^2 p^2}
\le \frac{2}{4Tp}
\le \frac{\varepsilon^2 T}{16T} = \frac{\varepsilon^2}{16} \le \frac{1}{16}.
\]

□
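A minimal Python sketch of the one-pass estimator from Lemma 9 is given below. It is an illustration under explicit assumptions: a plain dictionary of edge multiplicities replaces the sparse recovery sketch of Barkay et al., the function name and the `(u, v, delta)` stream encoding are hypothetical, and the sample size is a free parameter rather than $\frac{4n}{\varepsilon \sqrt{T}}$.

```python
import random


def estimate_shallow_edges(n, stream, C, sample_size, seed=0):
    """Estimate s_G by keeping the subgraph induced by a node sample.

    A shallow edge survives iff both endpoints are sampled, which happens
    with probability p = |S|(|S|-1) / (n(n-1)); the count is rescaled by 1/p.
    """
    rng = random.Random(seed)
    S = set(rng.sample(range(n), sample_size))
    deg = {v: 0 for v in S}   # degree in G of every sampled node
    induced = {}              # multiplicity of every edge inside S
    for u, v, delta in stream:
        if u in S:
            deg[u] += delta
        if v in S:
            deg[v] += delta
        if u in S and v in S:
            e = (min(u, v), max(u, v))
            induced[e] = induced.get(e, 0) + delta
    # X counts surviving edges whose endpoints are both light.
    X = sum(1 for (u, v), c in induced.items()
            if c > 0 and deg[u] <= C and deg[v] <= C)
    p = sample_size * (sample_size - 1) / (n * (n - 1))
    return X / p


# Star around node 0, two further edges, and one inserted-then-deleted edge.
stream = ([(0, i, 1) for i in range(1, 6)]
          + [(5, 6, 1), (7, 8, 1), (1, 2, 1), (1, 2, -1)])
print(estimate_shallow_edges(9, stream, C=3, sample_size=9))
```

When every node is sampled we have $p = 1$ and the sketch returns the exact number of shallow edges; the lemma's analysis concerns the variance introduced by smaller samples.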


Algorithm 3 Unweighted Matching Approximation

Require: $G = (V, E)$ with $a(G) \le \alpha$ and $\varepsilon \in (0, 1/\sqrt{3})$

Ensure: Estimator on the size of a maximum matching

Set $T = n^{2/5}$ for a single pass and $T = n^{1/3}$ for two passes and $\eta = 2.5 \lceil 2 \cdot \alpha + 3 \rceil + 5.75$.

Let $\hat{h}$ and $\hat{s}$ be the estimators from Lemma 7 and Lemma 9
for $i = 0, \ldots, \log(3T/(1 - \varepsilon))$ do

    Solve rank decision with parameter $k = 2^i$ on the Tutte matrix $T(G)$ with randomly chosen
    indeterminates

if $\mathrm{rank}(T(G)) < 3T/(1 - \varepsilon)$ then

    Output $2^{i+1}$ for the maximal $i \in \{0, \ldots, \log(3T/(1 - \varepsilon))\}$ with $\mathrm{rank}(T(G)) \ge 2^i$
else

    Output $\frac{\max\{\hat{h}, \hat{s}\}}{(1 + \varepsilon)\, \eta}$


Algorithm 3 shows the idea of the estimation of the unweighted maximum matching size in
bounded arboricity graphs using the previous results and the relation between the rank of the
Tutte matrix and the matching size.
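The relation between the Tutte matrix and the matching size can be checked offline with the following Python sketch. It is not the streaming rank decision procedure of Clarkson and Woodruff: it builds the full matrix with random entries modulo a prime and computes the rank by Gaussian elimination. The modulus, function names and seed are assumptions made for this illustration; with high probability over the random entries the rank equals twice the maximum matching size.

```python
import random

P = 2_147_483_647  # prime modulus 2^31 - 1 (an illustrative choice)


def tutte_matrix(n, edges, rng):
    """Skew-symmetric matrix with a random non-zero entry per edge."""
    T = [[0] * n for _ in range(n)]
    for u, v in edges:
        i, j = min(u, v), max(u, v)
        x = rng.randrange(1, P)
        T[i][j] = x
        T[j][i] = (-x) % P
    return T


def rank_mod_p(M):
    """Rank of M over the field Z_P via Gauss-Jordan elimination."""
    M = [row[:] for row in M]
    rows = len(M)
    cols = len(M[0]) if M else 0
    rank = 0
    for c in range(cols):
        pivot = next((r for r in range(rank, rows) if M[r][c]), None)
        if pivot is None:
            continue
        M[rank], M[pivot] = M[pivot], M[rank]
        inv = pow(M[rank][c], P - 2, P)  # inverse by Fermat's little theorem
        M[rank] = [a * inv % P for a in M[rank]]
        for r in range(rows):
            if r != rank and M[r][c]:
                f = M[r][c]
                M[r] = [(a - f * b) % P for a, b in zip(M[r], M[rank])]
        rank += 1
    return rank


def matching_size_via_tutte(n, edges, seed=0):
    """Half the rank of the randomised Tutte matrix."""
    return rank_mod_p(tutte_matrix(n, edges, random.Random(seed))) // 2


print(matching_size_via_tutte(4, [(0, 1), (1, 2), (2, 3)]))  # path on 4 nodes
```

Algorithm 3 only needs to decide whether the rank exceeds thresholds $k = 2^i$, which is what allows the $O(k^2 \log n)$ space bound; computing the exact rank as done here is a simplification for readability.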

Theorem 4. Let $G$ be a graph with $a(G) \le \alpha$ and $n \ge (16\alpha/\varepsilon)^5$. Let $\varepsilon \in (0, 1/\sqrt{3})$. Then there
exists an algorithm estimating the size of the maximum matching in $G$ within a $\frac{2(1 + \varepsilon)(5 a(G) + O(1))}{1 - \varepsilon}$
factor in the dynamic streaming model using

• a single pass over the data and $\tilde{O}(\alpha \cdot n^{4/5})$ space or

• 2 passes over the data and $\tilde{O}(\alpha \cdot n^{2/3})$ space.


Proof. For the sake of simplicity we assume that $3T/(1 - \varepsilon)$ is a power of two. We know that we
can decide the rank decision problem with parameter $k$ in a dynamic stream with one pass using
$O(k^2 \log n)$ space by Theorem 5.1 of Clarkson and Woodruff [9]. Thus, invoking this algorithm for
$k = 2^0, 2^1, \ldots, 2^{\log(3T/(1 - \varepsilon))}$ results in a space requirement of $O(T^2 \cdot \log T \cdot \log n) = O(T^2 \log^2 n)$
for our choices of $T$. For the first part of the theorem, we estimate $s_G$ and $h_G$ in one pass by $\hat{s}$ and
$\hat{h}$ within the space bounds of Lemma 7 and Lemma 9. Setting $T = n^{2/5}$ gives us the
desired space bound of $\tilde{O}(\alpha \cdot n^{4/5})$ (note that $T > (16\alpha/\varepsilon)^2$, which is required for Lemma 9). For
the second part of the theorem, we can improve the space requirements for the estimators $\hat{h}$ and
$\hat{s}$ by using Lemma 7 and Lemma 8. Now, setting $T = n^{1/3}$ gives the desired space
bound.

Let OPT be the size of a maximum matching. First, we check whether $\mathrm{OPT} \ge 2 \cdot 3T/(1 - \varepsilon)$
by invoking the rank decision algorithm with parameter $k = 3T/(1 - \varepsilon)$. Since the rank of the
matrix is equal to $2\,\mathrm{OPT}$, this decides whether $\mathrm{OPT} \ge 2 \cdot 3T/(1 - \varepsilon)$. If this is not true, we can
give a 2-approximation on OPT by testing whether the rank of the Tutte matrix is in $[2^i, 2^{i+1})$
for $i = 0, \ldots, \log(3T/(1 - \varepsilon)) - 1$. If $\mathrm{OPT} \ge 2 \cdot 3T/(1 - \varepsilon)$, Lemma 6 implies that $\max\{h_G, s_G\} \ge
3T/(1 - \varepsilon)$ since $h_G + s_G \ge \mathrm{OPT}$. Assuming that we can approximate $\max\{h_G, s_G\}$, then again by
Lemma 6 we can estimate OPT since

\[
\frac{\max\{h_G, s_G\}}{\eta} \le \mathrm{OPT} \le h_G + s_G \le 2 \max\{h_G, s_G\}.
\]

W.l.o.g. let $\hat{h} = \max\{\hat{h}, \hat{s}\}$. Now we have two cases:

1. If $h_G = \max\{h_G, s_G\} \ge T$ then by Lemma 7 $\hat{h}$ is a $(1 \pm \varepsilon)$ estimation on $h_G$.

2. If $s_G = \max\{h_G, s_G\} \ge 3T/(1 - \varepsilon)$ we know by Lemma 9 that $\hat{s} \ge 3T$, which implies that
$\hat{h} \ge \hat{s} \ge 3T$. Thus by Lemma 7 $\hat{h}$ is a $(1 \pm \varepsilon)$ estimation on $h_G$. This gives us

\[
(1 - \varepsilon) s_G \le \hat{s} \le \hat{h} \le (1 + \varepsilon) h_G \le (1 + \varepsilon) s_G.
\]

Therefore, $\max\{\hat{h}, \hat{s}\}$ is a good estimator for $\max\{h_G, s_G\}$. For the estimator we have

\[
\frac{1 - \varepsilon}{2(1 + \varepsilon)\eta} \cdot \mathrm{OPT}
\le \frac{(1 - \varepsilon) \max\{h_G, s_G\}}{(1 + \varepsilon)\eta}
\le \frac{\max\{\hat{h}, \hat{s}\}}{(1 + \varepsilon)\eta}
\le \frac{(1 + \varepsilon) \max\{h_G, s_G\}}{(1 + \varepsilon)\eta}
\le \mathrm{OPT}.
\]

□


3 Lower Bound 

Esfandiari et al. [15] showed a space lower bound of $\Omega(\sqrt{n})$ for any estimation better than 3/2.
Their reduction (see below) uses the Boolean Hidden Matching Problem introduced by Bar-Yossef 
et al. [6], and further studied by Gavinsky et al. 02 ! We will use the following generalization due 
to Verbin and Yu [33]. 

Definition 3 (Boolean Hidden Hypermatching Problem [33]). In the Boolean Hidden Hypermatching
Problem $BHH_{t,n}$ Alice gets a vector $x \in \{0, 1\}^n$ with $n = 2kt$ and $k \in \mathbb{N}$ and Bob gets a perfect
$t$-hypermatching $M$ on the $n$ coordinates of $x$, i.e., each edge has exactly $t$ coordinates, and a string
$w \in \{0, 1\}^{n/t}$. We denote the vector of length $n/t$ given by $\left(\bigoplus_{1 \le i \le t} x_{M_{1,i}}, \ldots, \bigoplus_{1 \le i \le t} x_{M_{n/t,i}}\right)$ by
$Mx$ where $(M_{1,1}, \ldots, M_{1,t}), \ldots, (M_{n/t,1}, \ldots, M_{n/t,t})$ are the edges of $M$. The problem is to return
1 if $Mx \oplus w = 1^{n/t}$ and 0 if $Mx \oplus w = 0^{n/t}$; otherwise the algorithm may answer arbitrarily.

Verbin and Yu [33] showed a lower bound of $\Omega(n^{1 - 1/t})$ for the randomized one-way communication
complexity for $BHH_{t,n}$. For our reduction we require $w = 0^{n/t}$ and thus $Mx = 1^{n/t}$ or $Mx = 0^{n/t}$.
We denote this problem by $BHH^0_{t,n}$. We can show that this does not reduce the communication
complexity.

Lemma 10. The communication complexity of $BHH^0_{t,4n}$ is lower bounded by the communication
complexity of $BHH_{t,n}$.

Proof. First, let us assume that $t$ is odd. Let $x \in \{0, 1\}^n$ with $n = 2kt$ for some $k \in \mathbb{N}$ and $M$ be a
perfect $t$-hypermatching on the $n$ coordinates of $x$ and $w \in \{0, 1\}^{n/t}$. We define $x' = [x^T\ x^T\ \bar{x}^T\ \bar{x}^T]^T$
to be the concatenation of two identical copies of $x$ and two identical copies of the vector $\bar{x}$ resulting
from the bitwise negation of $x$. W.l.o.g. let $\{x_1, \ldots, x_t\} \in M$ be the $\ell$-th hyperedge of $M$. Then
we add the following four hyperedges to $M'$:

• $\{\bar{x}_1, \bar{x}_2, x_3, \ldots, x_t\}$, $\{\bar{x}_1, x_2, \bar{x}_3, \ldots, \bar{x}_t\}$, $\{x_1, \bar{x}_2, \bar{x}_3, \ldots, \bar{x}_t\}$, and $\{x_1, \ldots, x_t\}$ if $w_\ell = 0$,

• $\{\bar{x}_1, x_2, \ldots, x_t\}$, $\{x_1, \bar{x}_2, x_3, \ldots, x_t\}$, $\{x_1, x_2, \bar{x}_3, \ldots, \bar{x}_t\}$, and $\{\bar{x}_1, \ldots, \bar{x}_t\}$ if $w_\ell = 1$.

The important observation here is that we flip an even number of bits in the case $w_\ell = 0$ and an odd
number of bits if $w_\ell = 1$ (since $t$ is odd). Since every bit flip results in a change of the parity
of the set of bits, the parity does not change if we flip an even number of bits and the parity
flips if we negate an odd number of bits. Therefore, if $w_\ell$ is the correct (respectively wrong)
parity of $\{x_1, \ldots, x_t\}$ then the parity of the added sets is 0 (respectively 1), i.e., $M'x' = 0^{4n/t}$ if
$Mx \oplus w = 0^{n/t}$ and $M'x' = 1^{4n/t}$ if $Mx \oplus w = 1^{n/t}$. The number of ones in $x' \in \{0, 1\}^{4n}$ is exactly
$2n$. If $t$ is even, we can just change the cases for the added edges such that we flip an even number
of bits in the case $w_\ell = 0$ and an odd number of bits if $w_\ell = 1$. Overall, this shows that a lower
bound for $BHH_{t,n}$ implies a lower bound for $BHH^0_{t,4n}$. □



Figure 1: Worst case instance for t = 3. Bob’s hypermatching corresponds to disjoint 3-cliques
among the lower nodes and Alice’s input vector corresponds to the edges between upper
and lower nodes.


Theorem 2. Any randomized streaming algorithm that approximates the maximum matching size
within a $1 + \frac{1}{3t/2 - 1}$ factor for $t \ge 2$ needs $\Omega(n^{1 - 1/t})$ space.
Proof. Let $x, M$ be the input to the $BHH^0_{t,n}$ problem, i.e., $M$ is a perfect $t$-hypermatching on the
coordinates of $x$, $x$ has exactly $n/2$ ones and it is promised that either $Mx = 0^{n/t}$ or $Mx = 1^{n/t}$.
We construct the graph for the reduction as described above: For each bit $x_i$ we have two nodes
$v_{1,i}, v_{2,i}$ and Alice adds the edge $\{v_{1,i}, v_{2,i}\}$ iff $x_i = 1$. For each edge $\{x_{i_1}, \ldots, x_{i_t}\} \in M$ Bob adds a
$t$-clique consisting of the nodes $v_{2,i_1}, \ldots, v_{2,i_t}$. For now, let us assume $t$ to be odd. We know that
the matching is at least $n/2$ because $x$ has exactly $n/2$ ones. Since Bob adds a clique for every
edge it is always possible to match all (or all but one) nodes of the clique whose corresponding
bit is 0. In the case of $Mx = 0^{n/t}$ the parity of every edge is 0, i.e., the number of nodes whose
corresponding bit is 1 is even. Let $M_{2i} \subseteq M$ be the hyperedges containing exactly $2i$ one bits
and define $l_{2i} := |M_{2i}|$. Then we know $n/2 = \sum_{i=0}^{\lfloor t/2 \rfloor} 2i \cdot l_{2i}$ and $|M| = n/t = \sum_{i=0}^{\lfloor t/2 \rfloor} l_{2i}$. For
every edge in $M_{2i}$ the size of the maximum matching within the corresponding subgraph is exactly
$2i + \lfloor (t - 2i)/2 \rfloor = 2i + \lfloor t/2 \rfloor - i$ for every $i = 0, \ldots, \lfloor t/2 \rfloor$ (see Fig. 1). Thus, we have a matching
of size

\[
\sum_{i=0}^{\lfloor t/2 \rfloor} (2i + \lfloor t/2 \rfloor - i)\, l_{2i} = \frac{n}{2} + \frac{t - 1}{2} \cdot \frac{n}{t} - \frac{n}{4} = \frac{3n}{4} - \frac{n}{2t}.
\]

If we have $Mx = 1^{n/t}$ then let $M_{2i+1} \subseteq M$ be the hyperedges containing exactly $2i + 1$ one bits and
define $l_{2i+1} := |M_{2i+1}|$. Again, we know $n/2 = \sum_{i=0}^{\lfloor t/2 \rfloor} (2i + 1) \cdot l_{2i+1}$ and $|M| = n/t = \sum_{i=0}^{\lfloor t/2 \rfloor} l_{2i+1}$.
For every edge in $M_{2i+1}$ the size of the maximum matching within the corresponding subgraph is
exactly $2i + 1 + (t - 2i - 1)/2 = 2i + 1 + \lfloor t/2 \rfloor - i$ for every $i = 0, \ldots, \lfloor t/2 \rfloor$. Thus, the maximum
matching has a size

\[
\sum_{i=0}^{\lfloor t/2 \rfloor} (2i + 1 + \lfloor t/2 \rfloor - i)\, l_{2i+1}
= \frac{n}{2} + \frac{t - 1}{2} \cdot \frac{n}{t} - \frac{1}{2} \sum_{i=0}^{\lfloor t/2 \rfloor} (2i + 1) \cdot l_{2i+1} + \frac{n}{2t}
= \frac{3n}{4}.
\]

For $t$ even, the size of the matching is

\[
\sum_{i=0}^{t/2} \left(2i + (t - 2i)/2\right) l_{2i} = \frac{n}{2} + \frac{t}{2} \cdot \frac{n}{t} - \frac{n}{4} = \frac{3n}{4}
\]

if $Mx = 0^{n/t}$. Otherwise, we have

\[
\sum_{i=0}^{t/2} \left(2i + 1 + \left\lfloor \frac{t - 2i - 1}{2} \right\rfloor\right) l_{2i+1}
= \frac{n}{2} + \sum_{i=0}^{t/2} (t/2 - i - 1)\, l_{2i+1}
= \frac{n}{2} + (t/2 - 1) \cdot \frac{n}{t} - \frac{n}{4} + \frac{n}{2t}
= \frac{3n}{4} - \frac{n}{2t}.
\]

As a consequence, every streaming algorithm that computes an $\alpha$-approximation on the size of
a maximum matching with

\[
\alpha < \frac{(3/4)\, n}{\left((3/4) - 1/(2t)\right) n} = \frac{1}{1 - 4/(6t)} = 1 + \frac{1}{3t/2 - 1}
\]

can distinguish between $Mx = 0^{n/t}$ and $Mx = 1^{n/t}$ and, thus, needs $\Omega(n^{1 - 1/t})$ space.

□

Finally, constructing the Tutte matrix with randomly chosen entries gives us the following corollary.


Corollary 2. Any randomized streaming algorithm that approximates $\mathrm{rank}(A)$ of $A \in \mathbb{R}^{n \times n}$ within
a $1 + \frac{1}{3t/2 - 1}$ factor for $t \ge 2$ requires $\Omega(n^{1 - 1/t})$ space.
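The gap used in the proof of Theorem 2 can be verified on a tiny instance with the following Python sketch. It builds the reduction graph for $t = 3$ and $n = 12$ and computes the maximum matching by exhaustive search with memoisation. The node numbering (node $i$ for $v_{1,i}$ and node $n + i$ for $v_{2,i}$), the function names and the two example inputs are assumptions chosen for illustration; the search is exponential in general and only suitable for such small graphs.

```python
from functools import lru_cache


def reduction_graph(x, hyperedges):
    """Alice adds {v_1i, v_2i} iff x_i = 1; Bob adds a clique per hyperedge."""
    n = len(x)
    edges = [(i, n + i) for i in range(n) if x[i] == 1]
    for he in hyperedges:
        for a in range(len(he)):
            for b in range(a + 1, len(he)):
                edges.append((n + he[a], n + he[b]))
    return 2 * n, edges


def max_matching_size(num_nodes, edges):
    """Exact maximum matching via memoised search over sets of free nodes."""
    adj = [0] * num_nodes
    for u, v in edges:
        adj[u] |= 1 << v
        adj[v] |= 1 << u

    @lru_cache(maxsize=None)
    def solve(free):
        if free == 0:
            return 0
        v = (free & -free).bit_length() - 1  # lowest-numbered free node
        rest = free & ~(1 << v)
        best = solve(rest)                   # option: leave v unmatched
        nbrs = adj[v] & rest
        while nbrs:
            u = (nbrs & -nbrs).bit_length() - 1
            best = max(best, 1 + solve(rest & ~(1 << u)))
            nbrs &= nbrs - 1                 # drop the neighbour just tried
        return best

    return solve((1 << num_nodes) - 1)


M = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11)]   # t = 3, n = 12
x_even = [1, 1, 0, 1, 1, 0, 1, 1, 0, 0, 0, 0]        # every parity is 0
x_odd = [1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 1, 1]         # every parity is 1
print(max_matching_size(*reduction_graph(x_even, M)))
print(max_matching_size(*reduction_graph(x_odd, M)))
```

Both inputs have exactly $n/2 = 6$ ones. For $t = 3$ and $n = 12$ the formulas in the proof give $3n/4 - n/(2t) = 7$ when $Mx = 0^{n/t}$ and $3n/4 = 9$ when $Mx = 1^{n/t}$, which is the gap an approximation better than $1 + \frac{1}{3t/2 - 1} = 9/7$ would detect.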